Speech recognition in noisy environments using a switching linear dynamic model for feature enhancement
Authors
Abstract
The performance of automatic speech recognition systems degrades sharply whenever the speech signal is disturbed by background noise. We aim to improve noise robustness by addressing all major levels of speech recognition: feature extraction, feature enhancement, and speech modeling. Different auditory modeling concepts, speech enhancement techniques, training strategies, and model architectures are implemented in an in-car digit and spelling recognition task. We demonstrate that joint speech and noise modeling, with a global Switching Linear Dynamic Model (SLDM) capturing the dynamics of speech and a Linear Dynamic Model (LDM) for noise, outperforms state-of-the-art speech enhancement techniques. Furthermore, we show that SLDM feature enhancement outperforms the baseline recognizer of the Interspeech Consonant Challenge 2008 on almost all of the noisy test sets.
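To make the joint speech and noise model more concrete, the following is a minimal sketch of the generative side only, not the enhancement algorithm of the paper. It assumes a common SLDM formulation in which each frame follows x_t = A[s] x_{t-1} + b[s] + v_t with a hidden state s drawn independently per frame, treats the noise LDM as the single-state special case, and mixes the two with the usual log-spectral approximation y = log(exp(x) + exp(n)). All function names, dimensions, and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_sldm(A, b, C, prior, T, rng):
    """Draw T frames from a switching linear dynamic model (illustrative).

    x_t = A[s_t] @ x_{t-1} + b[s_t] + v_t,  v_t ~ N(0, C[s_t]).
    Assumption: the hidden state s_t is drawn independently for each
    frame from `prior` (no Markov transitions between states).
    """
    n_states, dim = b.shape
    states = rng.choice(n_states, size=T, p=prior)
    frames = np.empty((T, dim))
    x = np.zeros(dim)  # assumed initial frame
    for t, s in enumerate(states):
        x = A[s] @ x + b[s] + rng.multivariate_normal(np.zeros(dim), C[s])
        frames[t] = x
    return states, frames


def sample_ldm(A, b, C, T, rng):
    """A linear dynamic model is an SLDM with a single hidden state."""
    _, frames = sample_sldm(A[None], b[None], C[None], np.array([1.0]), T, rng)
    return frames


def mix_log_spectral(x, n):
    """Additive noise expressed on log-spectral features.

    Assumed approximation: y = log(exp(x) + exp(n)), phase terms ignored.
    """
    return np.logaddexp(x, n)


# Hypothetical toy parameters: 3-dimensional features, 2 speech states.
rng = np.random.default_rng(0)
dim, T = 3, 50
A_speech = np.stack([0.9 * np.eye(dim), 0.5 * np.eye(dim)])
b_speech = np.array([[0.2, 0.0, -0.1], [1.0, 0.5, 0.0]])
C_speech = np.stack([0.05 * np.eye(dim), 0.10 * np.eye(dim)])
prior = np.array([0.6, 0.4])

states, speech = sample_sldm(A_speech, b_speech, C_speech, prior, T, rng)
noise = sample_ldm(0.95 * np.eye(dim), np.full(dim, 0.05),
                   0.01 * np.eye(dim), T, rng)
noisy = mix_log_spectral(speech, noise)
```

Feature enhancement would then amount to estimating `speech` from `noisy` under these models, which requires approximate inference that this sketch does not attempt.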
Similar resources
Noisy Speech Feature Estimation on the Aurora2 Database using a Switching Linear Dynamic Model
This paper presents an approach to enhance speech feature estimation in the log spectral domain under additive noise environments. A switching linear dynamic model (SLDM) is explored as a parametric model for the clean speech distribution, enforcing a state transition in the feature space and capturing the smooth time evolution of speech conditioned on the state sequence. Experimental results u...
Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement
Performance of speech recognition systems strongly degrades in the presence of background noise, like the driving noise inside a car. In contrast to existing works, we aim to improve noise robustness focusing on all major levels of speech recognition: feature extraction, feature enhancement, speech modelling, and training. Thereby, we give an overview of promising auditory modelling concepts, s...
Dynamic Robust Speech Recognition
Robust recognition theory has become one of the research focuses of acoustic speech recognition. The digital acoustic speech signal is a random process in which stationary segments repeatedly alternate with non-stationary ones. However, both the current linear and stationary characteristic parameters drawn from such signals and the rigid recognition models do not adapt to such repeatedly alternating property o...
An approach to iterative speech feature enhancement and recognition
In this paper we propose a novel iterative speech feature enhancement and recognition architecture for noisy speech recognition. It consists of model-based feature enhancement employing Switching Linear Dynamical Models (SLDM), a Hidden Markov Model (HMM) decoder, and a state mapper, which maps HMM states to SLDM states. To consistently adhere to a Bayesian paradigm, posteriors are exchanged between th...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...